Importing Libraries

Parsing Data and Getting Metrics of Features
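A minimal sketch of this step, assuming the data is tabular and read with pandas. The inline CSV and the column names f1 to f3 are hypothetical stand-ins, since the actual data file is not named here.

```python
import io

import pandas as pd

# Hypothetical inline CSV standing in for the real data file.
raw = io.StringIO(
    "f1,f2,f3\n"
    "1.0,10.0,100.0\n"
    "2.0,20.0,90.0\n"
    "3.0,30.0,80.0\n"
    "4.0,40.0,70.0\n"
)
df = pd.read_csv(raw)

# describe() reports count, mean, std, min, quartiles, and max per feature.
metrics = df.describe()
print(metrics)
```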

Heatmap to show correlations between features
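One way such a heatmap could be drawn is sketched below with plain matplotlib; the plotting library actually used is not stated here. The data is synthetic, and the column names are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in data: f4 is built to correlate strongly with f1.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(100, 4)), columns=["f1", "f2", "f3", "f4"])
df["f4"] = df["f1"] * 2 + rng.normal(scale=0.1, size=100)

# Pairwise Pearson correlations between all features.
corr = df.corr()

fig, ax = plt.subplots(figsize=(5, 4))
image = ax.imshow(corr.to_numpy(), cmap="coolwarm", vmin=-1, vmax=1)
ax.set_xticks(range(len(corr.columns)))
ax.set_xticklabels(corr.columns)
ax.set_yticks(range(len(corr.index)))
ax.set_yticklabels(corr.index)
fig.colorbar(image, ax=ax, label="Pearson correlation")
ax.set_title("Feature correlation heatmap")
```

In a notebook, the figure renders inline without the Agg backend line.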

Values are standardized so that every feature is on the same scale, which simplifies the clustering process
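A short sketch of standardization, assuming scikit-learn's StandardScaler is the tool used (the scaler is not named here). Each column is shifted to zero mean and scaled to unit variance, so features with large raw ranges no longer dominate the distances KMeans computes.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Toy data: the second column is 200x the first, so raw distances
# would be dominated by it.
X = np.array([
    [1.0, 200.0],
    [2.0, 400.0],
    [3.0, 600.0],
    [4.0, 800.0],
])

scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# After scaling, both columns have mean 0 and standard deviation 1.
print(X_scaled.mean(axis=0))
print(X_scaled.std(axis=0))
```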

Initial KMeans Model
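A minimal sketch of an initial model, using synthetic data from make_blobs. The choice of three clusters, n_init=10, and the random seed are assumptions for illustration, not settings taken from the original analysis.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the standardized feature matrix.
X, _ = make_blobs(n_samples=300, centers=3, n_features=4, random_state=0)
X_scaled = StandardScaler().fit_transform(X)

# n_clusters=3 is an assumed starting point.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = kmeans.fit_predict(X_scaled)

print(kmeans.inertia_)      # SSE of the fitted model
print(np.bincount(labels))  # number of samples per cluster
```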

The scikitplot module is used to plot SSE values and clustering durations for 1 to 11 clusters

SSE values for each number of clusters are shown in a dataframe
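The SSE values and fit durations can also be collected by hand. The sketch below uses scikit-learn directly rather than scikitplot, runs on synthetic data, and assumes the range 1 to 11 is inclusive.

```python
import time

import pandas as pd
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the standardized feature matrix.
X, _ = make_blobs(n_samples=300, centers=4, n_features=5, random_state=0)
X_scaled = StandardScaler().fit_transform(X)

rows = []
for k in range(1, 12):  # 1 through 11 clusters
    start = time.perf_counter()
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X_scaled)
    elapsed = time.perf_counter() - start
    # inertia_ is the sum of squared distances to the nearest centroid (SSE).
    rows.append({"n_clusters": k, "sse": model.inertia_, "fit_seconds": elapsed})

sse_table = pd.DataFrame(rows).set_index("n_clusters")
print(sse_table)
```

With standardized data, the SSE for a single cluster equals the number of samples times the number of features, which gives a quick sanity check on the first row.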

PCA with six components is applied, and a silhouette plot is created from the cluster labels
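A sketch of the numbers behind a silhouette plot, on synthetic data. It assumes the labels come from KMeans fitted on the six PCA components; the 13 features and three clusters are also assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_samples, silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data with 13 features (an assumed feature count).
X, _ = make_blobs(n_samples=300, centers=3, n_features=13, random_state=0)
X_scaled = StandardScaler().fit_transform(X)

# Reduce to six principal components.
pca = PCA(n_components=6, random_state=0)
X_pca = pca.fit_transform(X_scaled)

# Cluster in the reduced space (assumed workflow).
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X_pca)

# Per-sample silhouette values are what a silhouette plot draws,
# grouped by cluster; the overall score is their mean.
sample_scores = silhouette_samples(X_pca, labels)
mean_score = silhouette_score(X_pca, labels)

for cluster in np.unique(labels):
    print(cluster, sample_scores[labels == cluster].mean())
print(mean_score)
```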

Features 13 and 12 have the greatest correlation before standardization, and features 8 and 11 have the second greatest
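A ranking like this can be read off the correlation matrix programmatically. The sketch below uses synthetic data with hypothetical column names a to d, built so that a/b is the strongest pair and c/d the second strongest.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
base = rng.normal(size=200)
df = pd.DataFrame({
    "a": base,
    "b": base + rng.normal(scale=0.05, size=200),  # nearly identical to a
    "c": rng.normal(size=200),
})
df["d"] = df["c"] + rng.normal(scale=0.5, size=200)  # moderately tied to c

corr = df.corr().abs()

# Keep only the upper triangle so each pair appears once
# and the diagonal of self-correlations is dropped.
mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
pairs = corr.where(mask).stack().dropna().sort_values(ascending=False)

print(pairs.head(2))
```

Pearson correlation does not change when each feature is standardized separately, so the same ranking would come out of the scaled data.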

Testing is done through a 3D scatter plot showing all three features, categorized by cluster number from the fitted and predicted model
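A minimal sketch of such a plot with matplotlib, on synthetic three-feature data. The axis names, the three clusters, and the use of fit_predict for the labels are assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the three selected features.
X, _ = make_blobs(n_samples=300, centers=3, n_features=3, random_state=0)
X_scaled = StandardScaler().fit_transform(X)

# Fit the model and predict a cluster label for every sample.
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X_scaled)

fig = plt.figure(figsize=(6, 5))
ax = fig.add_subplot(projection="3d")
points = ax.scatter(
    X_scaled[:, 0], X_scaled[:, 1], X_scaled[:, 2],
    c=labels, cmap="viridis", s=15,
)
ax.set_xlabel("feature 1")
ax.set_ylabel("feature 2")
ax.set_zlabel("feature 3")
fig.colorbar(points, ax=ax, label="cluster")
```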